JCSM Shareware Collection 1996 September

Text File | 1995-05-29 | 55KB | 1,326 lines

                   Euphoria Programming Language
                            version 1.3
                         Reference Manual

                 (c) 1995 Rapid Deployment Software

     Permission is freely granted to anyone to copy this manual.


                         TABLE OF CONTENTS
                         =================

Part I - Core Language - refman.doc

    1. Introduction
        1.1 Example Program
        1.2 Installation
        1.3 Running a Program
        1.4 Editing a Program
        1.5 Distributing a Program

    2. Language Definition
        2.1 Objects
        2.2 Expressions
        2.3 Declarations
        2.4 Statements
        2.5 Top-Level Commands

    3. Debugging

Part II - Library Routines - see library.doc

    1. Introduction
    2. Routines by Application Area
    3. Alphabetical Listing of all Routines


1. Introduction
===============

Euphoria is a new programming language with the following advantages over
conventional languages:

 o  a remarkably simple, flexible, powerful language definition that is
    extremely easy to learn and use.

 o  dynamic storage allocation. Variables grow or shrink without the
    programmer having to worry about allocating and freeing chunks of
    memory. Objects of any size can be assigned to an element of a
    Euphoria sequence (array).

 o  a high-performance, state-of-the-art interpreter that is 10 to 20
    times faster than conventional interpreters such as Microsoft QBasic.

 o  lightning-fast pre-compilation. Your program is checked for syntax
    and converted into an efficient internal form at over 12,000 lines
    per second on a 486-50.

 o  extensive run-time checking for: out-of-bounds subscripts,
    uninitialized variables, bad parameter values for built-in functions,
    illegal value assigned to a variable and many more. There are no
    mysterious machine exceptions -- you will always get a full English
    description of any problem that occurs with your program at run-time,
    along with a call-stack trace-back and a dump of all of your variable
    values. Programs can be debugged quickly, easily and more thoroughly.

 o  features of the underlying hardware are completely hidden. Programs
    are not aware of word-lengths, underlying bit-level representation of
    values, byte-order etc.
    Euphoria programs are therefore highly portable from one machine to
    another.

 o  a full-screen source debugger and an execution profiler are included,
    along with a full-screen, multi-file editor. On a color monitor, the
    editor displays Euphoria programs in multiple colors, to highlight
    comments, reserved words, built-in functions, strings, and level of
    nesting of brackets. It optionally performs auto-completion of
    statements, saving you typing effort and reducing syntax errors. This
    editor is written in Euphoria, and the source code is provided to you
    without restrictions. You are free to modify it, add features, and
    redistribute it as you wish.

 o  Euphoria programs run under MS-DOS (or Windows or OS/2), but are not
    subject to any 64K or 640K memory limitations. You can create
    programs that use the full multi-megabyte memory of your computer.
    You can even set up a swap file for programs that need more memory
    than exists on your machine.

 o  Euphoria routines are naturally generic. The example program below
    shows a single routine that will sort any type of data -- integers,
    floating-point numbers, strings etc.

Euphoria is not an "Object-Oriented" language in the usual sense, yet it
achieves many of the benefits of these languages in a much simpler way.


1.1 Example Program
-------------------

The following is an example of a complete Euphoria program.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sequence list, sorted_list

function merge_sort(sequence x)
-- put x into ascending order using a recursive merge sort
    integer n, mid
    sequence merged, a, b

    n = length(x)
    if n = 0 or n = 1 then
        return x  -- trivial case
    end if

    mid = floor(n/2)
    a = merge_sort(x[1..mid])       -- sort first half of x
    b = merge_sort(x[mid+1..n])     -- sort second half of x

    -- merge the two sorted halves into one
    merged = {}
    while length(a) > 0 and length(b) > 0 do
        if compare(a[1], b[1]) < 0 then
            merged = append(merged, a[1])
            a = a[2..length(a)]
        else
            merged = append(merged, b[1])
            b = b[2..length(b)]
        end if
    end while
    return merged & a & b  -- merged data plus leftovers
end function

procedure print_sorted_list()
-- generate sorted_list from list
    list = {9, 10, 3, 1, 4, 5, 8, 7, 6, 2}
    sorted_list = merge_sort(list)
    ? sorted_list
end procedure

print_sorted_list()     -- this command starts the program
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The above example contains 4 separate commands that are processed in
order. The first declares two variables: list and sorted_list to be
sequences. The second defines a function merge_sort(). The third defines
a procedure print_sorted_list(). The final command calls procedure
print_sorted_list().

The output from the program will be:

        {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}.

merge_sort() will just as easily sort {1.5, -9, 1e6, 100} or
{"oranges", "apples", "bananas"}.

This example is stored as euphoria\demo\example.ex. This is not the
fastest way to sort in Euphoria. Go to the euphoria\demo directory and
type "ex allsorts" to see timings on several different sorting algorithms
for increasing numbers of objects. For a quick tutorial example of
Euphoria programming see euphoria\demo\bench\filesort.ex.


1.2 Installation
----------------

To install Euphoria on your machine, first read the file install.doc.
Installation simply involves copying the euphoria files to your hard disk
under a directory named "EUPHORIA", and then modifying your autoexec.bat
file so that EUPHORIA\BIN is on your search path, and the environment
variable EUDIR is set to the EUPHORIA directory. An automatic install
program, "install.bat" is provided for this purpose. For the latest
details, please read the instructions in install.doc before you run
install.bat.

When installed, the euphoria directory will look something like this:

        euphoria
                readme.doc
                \bin
                        ex.exe, dos4gw.exe, ed.bat, other utilities
                \include
                        standard include files, e.g. graphics.e
                \doc
                        ed.doc, refman.doc etc.
                \demo
                        demo programs, e.g. ttt.ex, mset.ex, plot3d.ex
                        \langwar
                                language war game, lw.ex
                        \bench
                                benchmark programs


1.3 Running a Program
---------------------

Euphoria programs are executed by typing "ex", followed by the name of
the main (or only) file. By convention, main Euphoria files have an
extension of ".ex". Other Euphoria files, that are meant to be included
in a larger program, end in ".e". To save typing, you can leave off the
".ex", and the ex command will supply it for you automatically. If the
file can't be found in the current directory, your PATH will be searched.
There are no command-line options for ex itself, but your program can
call the built-in function command_line() to read the ex command-line.
You can redirect standard input and standard output when you run a
Euphoria program, for example:

        ex filesort.ex < raw > sorted

or simply,

        ex filesort < raw > sorted

For frequently-used programs you might want to make a small .bat file
containing something like:

        @echo off
        ex myprog.ex %1 %2

where myprog.ex expects two command-line arguments. This will save you
from typing ex all the time.

ex.exe is in the euphoria\bin directory which must be on your search
path. The file dos4gw.exe must also be present in the bin directory (or
somewhere on the search path).
Some Euphoria programs expect the environment variable EUDIR to be set to
the main Euphoria directory.

Running Under Windows
---------------------

You can run Euphoria programs directly from the Windows environment, or
from a DOS shell that you have opened from Windows. By "associating" .ex
files with ex.exe, you can simply double-click on a .ex file to run it.
It is possible to have several Euphoria programs active in different
windows. You can resize these windows, move them around, change to a
different font, run things in the background, copy and paste between
windows etc. See your Windows manual. The Euphoria editor is available.
You might want to associate .e, .pro and other text files with ed.bat.
Also, the File-menu/Run-command will let you type in ex or ed command
lines.

Use of a swap file
------------------

If you run a Euphoria program under Windows (or in a DOS shell under
Windows) and the program runs out of physical memory, it will start using
"virtual memory". Windows provides this virtual memory by swapping out
least-recently-used data to a swap file. To change the size of the
Windows swap file, click on Control Panel / 386 Enhanced /
"virtual memory...".

Under DOS, outside of Windows, there is normally no swap file. However by
typing:

        swapon

you can tell the DOS4GW DOS-extender to create a temporary 16-megabyte
swap file for use by Euphoria programs. This file is created when each
Euphoria program starts, and is deleted when the program terminates.
Type:

        swapoff

to turn off this swapping feature. Do not enable swapping unless your
program needs it, as it adds a bit of overhead to the startup and
termination of each Euphoria program.

When disk swapping activity occurs your program will run correctly but
will slow down. A better approach may be to free up more extended memory
by cutting back on SMARTDRV and other programs that reserve large amounts
of extended memory for themselves.
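
As a small illustration of the command_line() function mentioned earlier
in section 1.3, the following sketch lists the words of the ex command
line, one per line. It assumes that command_line() returns a sequence of
strings, one for each word typed; see library.doc for the exact
definition. The file name args.ex is only a suggestion.

```euphoria
-- args.ex (hypothetical name): show the words on the ex command line
sequence cmd

cmd = command_line()    -- assumed: a sequence of strings, one per word
for i = 1 to length(cmd) do
    puts(1, cmd[i])     -- 1 is standard output
    puts(1, '\n')
end for
```

Running "ex args one two" should then list each word of the command
line, with the arguments "one" and "two" as the last two lines.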


1.4 Editing a Program
---------------------

You can use any text editor to edit a Euphoria program. However, Euphoria
comes with its own special editor that is written entirely in Euphoria.
Type:

        ed

followed by the complete name of the file you wish to edit (the .ex
extension is not assumed). You can use this editor to edit any kind of
text file. When you edit a .e or .ex file some extra features, such as
color syntax highlighting and auto-completion of certain statements, are
available to make your job easier.

Whenever you run a Euphoria program and get an error message, during
compilation or execution, you can simply type ed with no file name and
you will be automatically positioned in the file containing the error, at
the correct line and column, and with the error message displayed at the
top of the screen.

Under Windows you can associate ed.bat with various kinds of text files
that you want to edit.

Most keys that you type are inserted into the file at the cursor
position. Hit the Esc key once to get a menu bar of special commands. The
arrow keys, and the Insert/Delete Home/End PageUp/PageDown keys are also
active. See the file euphoria\doc\ed.doc for a complete description of
the editing commands. Esc h (help) will let you view ed.doc from your
editing session.

If you need to understand or modify any detail of the editor's operation,
you can edit the file ed.ex in euphoria\bin (be sure to make a backup
copy so you don't lose your ability to edit). If the name ed conflicts
with some other command on your system, simply rename the file
euphoria\bin\ed.bat to something else. Because this editor is written in
Euphoria, it is remarkably concise and easy to understand. The same
functionality implemented in a language like C would take far more lines
of code.


1.5 Distributing a Program
--------------------------

Your customer needs to have the 2 files: ex.exe and dos4gw.exe somewhere
on the search path.
You are free to supply anyone with the Public Domain Edition of ex.exe,
as well as dos4gw.exe to support it.

Your program can be distributed in source form or in shrouded form. In
source form you supply your Euphoria files plus any standard include
files that are required. To deliver a program in shrouded form, you run
the Euphoria source code shrouder, bin\shroud.ex, against your main
Euphoria file. The shrouder pulls together all included files into a
single compact file that is virtually unreadable. You then ship this one
file plus a copy of ex.exe and dos4gw.exe. One copy of ex.exe and
dos4gw.exe on a machine is sufficient to run any number of Euphoria
programs. Comments in bin\shroud.ex tell you how to run it, and what it
does to obscure or "shroud" your source.


2. Language Definition
======================

2.1 Objects
-----------

All data objects in Euphoria are either atoms or sequences. An atom is a
single numeric value. A sequence is an ordered list of data objects. The
objects contained in a sequence can be an arbitrary mix of atoms or
sequences. A sequence is represented by a list of objects in brace
brackets, separated by commas. Atoms can have any integer or
double-precision floating point value. They can range from approximately
-1e300 to +1e300 with 15 decimal digits of accuracy. Here are some
Euphoria objects:

        -- examples of atoms:
        0
        1000
        98.6
        -1e6

        -- examples of sequences:
        {2, 3, 5, 7, 11, 13, 17, 19}            -- 8-element sequence
        {1, 2, {3, 3, 3}, 4, {5, {6}}}          -- 5-element sequence
        {{"jon", "smith"}, 52389, 97.25}        -- 3-element sequence
        {}                                      -- 0-element sequence

Numbers can also be entered in hexadecimal. For example:

        #FE             -- 254
        #A000           -- 40960
        #FFFF00008      -- 68718428168
        -#10            -- -16

Sequences can be nested to any depth. Brace brackets are used to
construct sequences out of a list of expressions. These expressions are
evaluated at run-time. e.g.

        {x+6, 9, y*w+2, sin(0.5)}

The "Hierarchical Objects" part of the Euphoria acronym comes from the
hierarchical nature of nested sequences. This should not be confused with
the class hierarchies of certain object-oriented languages.

Performance Note: The Euphoria interpreter will store integer-valued
atoms as machine integers to save time and space.

Character Strings
-----------------

Character strings may be entered using quotes e.g.

        "ABCDEFG"

Strings are just sequences of characters, and may be manipulated and
operated upon just like any other sequences. For example the above string
is equivalent to the sequence

        {65, 66, 67, 68, 69, 70, 71}

which contains the corresponding ASCII codes. Similarly, "" is equivalent
to {}. Both represent the sequence of length-0. As a matter of
programming style, it is natural to use "" to suggest a length-0 sequence
of characters, and {} to suggest some other kind of sequence.

Individual characters may be entered using single quotes if it is desired
that they be treated as individual numbers (atoms) and not length-1
sequences. e.g.

        'B'     -- equivalent to the atom 66
        "B"     -- equivalent to the sequence {66}

Note that an atom is not equivalent to a one-element sequence containing
the same value, although there are a few built-in routines that choose to
treat them similarly.

Special characters may be entered using a back-slash:

        \n      newline
        \r      carriage return
        \t      tab
        \\      backslash
        \"      double quote
        \'      single quote

For example, "Hello, World!\n", or '\\'. The Euphoria editor displays
character strings in brown.

Comments
--------

Comments are started by two dashes and extend to the end of the current
line. e.g.

        -- this is a comment

Comments are ignored by the compiler and have no effect on execution
speed. The editor displays comments in red. In this manual we use
italics.


2.2 Expressions
---------------

Objects can be combined into expressions using binary and unary operators
as well as built-in and user-defined functions.
For example,

        {1,2,3} + 5

is an expression that adds the sequence {1,2,3} and the atom 5 to get the
resulting sequence {6,7,8}. Besides + there are many other operators. The
precedence of operators is as follows:

        highest precedence:     function/type calls

                                unary-  unary+  not

                                *  /

                                +  -

                                &

                                <  >  <=  >=  =  !=

        lowest precedence:      and, or

Thus 2+6*3 means 2+(6*3), not (2+6)*3. Operators on the same line above
have equal precedence and are evaluated left to right.

Relational & Logical Operators
------------------------------

The relational operators, <, >, <=, >=, =, != each produce a 1 (true) or
a 0 (false) result. These results can be used by the logical operators
'and', 'or', and 'not' to determine an overall truth value. e.g.

        b > 0 and b != 100 or not (c <= 5)

where b and c are the names of variables.

Subscripting of Sequences
-------------------------

A single element of a sequence may be selected by giving the element
number in square brackets. Element numbers start at 1. Non-integer
subscripts are rounded down to an integer.

For example, if x contains {5, 7, 9, 11, 13} then x[2] is 7. Suppose we
assign something different to x[2]:

        x[2] = {11,22,33}

Then x becomes: {5, {11,22,33}, 9, 11, 13}. Now if we ask for x[2] we get
{11,22,33} and if we ask for x[2][3] we get the atom 33. If you try to
subscript with a number that is outside of the range 1 to the number of
elements, you will get a subscript error. For example x[0], x[-99] or
x[6] will cause errors. So will x[1][3] since x[1] is not a sequence.
There is no limit to the number of subscripts that may follow a variable,
but the variable must contain sequences that are nested deeply enough.
The two dimensional array, common in other languages, can be easily
simulated with a sequence of sequences:

        { {5, 6, 7, 8, 9},
          {1, 2, 3, 4, 5},
          {0, 1, 0, 1, 0} }

An expression of the form x[i][j] can be used to access any element.
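
The following sketch stores the two dimensional array shown above in a
variable and picks out one element, one row and one column. The variable
names table, row and column are chosen only for this illustration. The
column has to be gathered with a loop, one element at a time.

```euphoria
sequence table, row, column

table = { {5, 6, 7, 8, 9},
          {1, 2, 3, 4, 5},
          {0, 1, 0, 1, 0} }

? table[2][4]           -- row 2, column 4: the atom 4

row = table[3]          -- an entire row: {0, 1, 0, 1, 0}

-- collect column 2 by visiting each row in turn
column = {}
for i = 1 to length(table) do
    column = append(column, table[i][2])
end for
-- column is now {6, 2, 1}
```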
The two dimensions are not symmetric however, since an entire "row" can
be selected with x[i], but there is no simple expression to select an
entire column. Other logical structures, such as n-dimensional arrays,
arrays of strings, arrays of structures etc. can also be handled easily
and flexibly.

Note that expressions in general may not be subscripted, just variables.
For example: {5,6,7,8}[3] is not supported.

Slicing of Sequences
--------------------

A sequence of consecutive elements may be selected by giving the starting
and ending element numbers. For example if x is {1, 1, 2, 2, 2, 1, 1, 1}
then x[3..5] is the sequence {2, 2, 2}. x[3..3] is the sequence {2}.
x[3..2] is also allowed. It evaluates to the length-0 sequence {}. If y
has the value: {"fred", "george", "mary"} then y[1..2] is
{"fred", "george"}.

We can also use slices for overwriting portions of variables. After

        x[3..5] = {9, 9, 9}

x would be {1, 1, 9, 9, 9, 1, 1, 1}. We could also have said

        x[3..5] = 9

with the same effect. Suppose y is {0, "Euphoria", 1, 1}. Then y[2][1..4]
is "Euph". If we say

        y[2][1..4] = "ABCD"

then y will become {0, "ABCDoria", 1, 1}.

We need to be a bit more precise in defining the rules for empty slices.
Consider a slice s[i..j] where s is of length n. A slice from i to j,
where j = i-1 and i >= 1 produces the empty sequence, even if i = n+1.
Thus 1..0 and n+1..n and everything in between are legal (empty) slices.
Empty slices are quite useful in many algorithms. A slice from i to j
where j < i - 1 is illegal, i.e. "reverse" slices such as s[5..3] are not
allowed.

Concatenation of Sequences and Atoms
------------------------------------

Any two objects may be concatenated using the & operator. The result is a
sequence with a length equal to the sum of the lengths of the
concatenated objects (where atoms are considered here to have length 1).
e.g.

        {1, 2, 3} & 4                   -- result is {1, 2, 3, 4}

        4 & 5                           -- result is {4, 5}

        {{1, 1}, 2, 3} & {4, 5}         -- result is {{1, 1}, 2, 3, 4, 5}

        x = {}
        y = {1, 2}
        y = y & x                       -- y is still {1, 2}

Arithmetic Operations on Sequences
----------------------------------

Any binary or unary arithmetic operation, including any of the built-in
math routines, may be applied to entire sequences as well as to single
numbers.

When applied to a sequence, a unary operator is actually applied to each
element in the sequence to yield a sequence of results of the same
length. If one of these elements is itself a sequence then the same rule
is applied recursively. e.g.

        x = -{1, 2, 3, {4, 5}}          -- x is {-1, -2, -3, {-4, -5}}

If a binary operator has operands which are both sequences then the two
sequences must be of the same length. The binary operation is then
applied to corresponding elements taken from the two sequences to get a
sequence of results. e.g.

        x = {5, 6, 7, {1, 1}} + {10, 10, 20, 100}
        -- x is {15, 16, 27, {101, 101}}

If a binary operator has one operand which is a sequence while the other
is a single number (atom) then the single number is effectively repeated
to form a sequence of equal length to the sequence operand. The rules for
operating on two sequences then apply. Some examples:

        y = {4, 5, 6}

        w = 5 * y                       -- w is {20, 25, 30}

        x = {1, 2, 3}

        z = x + y                       -- z is {5, 7, 9}

        z = x < y                       -- z is {1, 1, 1}

        w = {{1, 2}, {3, 4}, {5}}

        w = w * y                       -- w is {{4, 8}, {15, 20}, {30}}

Comparison of Euphoria Objects with Other Languages
---------------------------------------------------

By basing Euphoria on this one, simple, general, recursive data
structure, a tremendous amount of the complexity normally found in
programming languages has been avoided. The arrays, record structures,
unions, arrays of records, multidimensional arrays, etc. of other
languages can all be easily simulated in Euphoria with sequences. So can
higher-level structures such as lists, stacks, queues, trees etc.
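
As a rough sketch of one such structure, a stack can be kept in a single
sequence, using append() to push and a slice to pop. The names stack and
top are chosen only for this illustration, and this is just one of
several ways it could be done. Any mix of atoms and sequences can be
pushed.

```euphoria
sequence stack
object top

stack = {}

-- push: append() adds one element, whatever kind of object it is
stack = append(stack, 42)
stack = append(stack, "hello")
stack = append(stack, {1.5, {2, 3}})
-- stack is now {42, "hello", {1.5, {2, 3}}}

-- pop: read the last element, then slice it off
top = stack[length(stack)]              -- top is {1.5, {2, 3}}
stack = stack[1..length(stack)-1]       -- stack is {42, "hello"}
```

Popping the last remaining element uses the empty slice stack[1..0],
which is legal and leaves stack equal to {}.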
Furthermore, in Euphoria you can have sequences of mixed type; you can
assign any object to an element of a sequence; and sequences easily grow
or shrink in length without your having to worry about storage allocation
issues. The exact layout of a data structure does not have to be declared
in advance, and can change dynamically as required. It is easy to write
generic code, where, for instance, you push or pop a mix of various kinds
of data objects using a single stack.

Data structure manipulations are very efficient since Euphoria will point
to large data objects rather than copy them.

Programming in Euphoria is based entirely on creating and manipulating
flexible, dynamic sequences of data. Sequences are it - there are no
other data structures to learn. You operate in a simple, safe, elastic
world of *values*, that is far removed from the rigid, tedious, dangerous
world of bits, bytes, pointers and machine crashes.

Unlike other languages such as LISP and Smalltalk, Euphoria's "garbage
collection" of unused storage is a continuous process that never causes
random delays in execution of a program, and does not pre-allocate huge
regions of memory.

The language definitions of conventional languages such as C, C++, Ada,
etc. are very complex. Most programmers become fluent in only a subset of
the language. The ANSI standards for these languages read like complex
legal documents.

You are forced to write different code for different data types simply to
copy the data, ask for its current length, concatenate it, compare it
etc. The manuals for those languages are packed with routines such as
"strcpy", "strncpy", "memcpy", "strcat", "strlen", "strcmp", "memcmp",
etc. that each only work on one of the many types of data.

Much of the complexity surrounds issues of data type. How do you define
new types? Which types of data can be mixed? How do you convert one type
into another in a way that will keep the compiler happy?
When you need to do something requiring flexibility at runtime, you
frequently find yourself trying to fake out the compiler.

In these languages the numeric value 4 (for example) can have a different
meaning depending on whether it is an int, a char, a short, a double, an
int * etc. In Euphoria, 4 is the atom 4, period. Euphoria has something
called types as we shall see later, but it is a much simpler concept.

Issues of dynamic storage allocation and deallocation consume a great
deal of programmer coding time and debugging time in these other
languages, and make the resulting programs much harder to understand.

Pointer variables are extensively used. The pointer has been called the
"go to" of data structures. It forces programmers to think of data as
being bound to a fixed memory location where it can be manipulated in all
sorts of low-level non-portable, tricky ways. A picture of the actual
hardware that your program will run on is never far from your mind.
Euphoria does not have pointers and does not need them.


2.3 Declarations
----------------

Identifiers
-----------

Variable names and other user-defined symbols (identifiers) may be of any
length. Upper and lower case are distinct. Identifiers must start with a
letter and then be followed by letters, digits or underscores. The
following reserved words have special meaning in Euphoria and may not be
used as identifiers:

        and             end             include         to
        by              exit            not             type
        constant        for             or              while
        do              function        procedure       with
        else            global          return          without
        elsif           if              then

The Euphoria editor displays these words in blue.

The following kinds of user-defined symbols may be declared in a program:

 o  procedures

    These perform some computation and may have a list of parameters,
    e.g.

        procedure empty()
        end procedure

        procedure plot(integer x, integer y)
            position(x, y)
            puts(1, '*')
        end procedure

    There are a fixed number of named parameters, but this is not
    restrictive since any parameter could be a variable-length sequence
    of arbitrary objects.
    In many languages variable-length parameter lists are impossible. In
    C, you must set up strange mechanisms that are complex enough that
    the average programmer cannot do it without consulting a manual or a
    local guru.

    A copy of the value of each argument is passed in. The formal
    parameter variables may be modified inside the procedure but this
    does not affect the value of the arguments.

    Performance Note: The interpreter does not actually copy sequences or
    floating-point numbers unless it becomes necessary. For example,

        y = {1,2,3,4,5,6,7,8.5,"ABC"}
        x = y

    The statement x = y does not actually cause a new copy of y to be
    created. Both x and y will simply "point" to the same sequence. If we
    later perform x[3] = 9, then a separate sequence will be created for
    x in memory (although there will still be just one shared copy of 8.5
    and "ABC"). The same thing applies to "copies" of arguments passed in
    to subroutines.

 o  functions

    These are just like procedures, but they return a value, and can be
    used in an expression, e.g.

        function max(atom a, atom b)
            if a >= b then
                return a
            else
                return b
            end if
        end function

    Any Euphoria object can be returned. You can, in effect, have
    multiple return values, by returning a sequence of objects. e.g.

        return {quotient, remainder}

    We will use the general term "subroutine", or simply "routine" when a
    remark is applicable to both procedures and functions.

 o  types

    These are special functions that may be used in declaring the allowed
    values for a variable. A type must have exactly one parameter and
    should return an atom that is either TRUE (non-zero) or FALSE (zero).
    Types can also be called just like other functions. They are
    discussed in more detail below.

 o  variables

    These may be assigned values during execution e.g.

        integer x
        x = 25

        object a, b, c
        a = {}
        b = a
        c = 0

 o  constants

    These are variables that are assigned an initial value that can never
    change e.g.

        constant MAX = 100
        constant Upper = MAX - 10, Lower = 5

    The result of any expression can be assigned to a constant, even one
    involving calls to previously defined functions, but once the
    assignment is made the value of the constant variable is "locked in".

Scope
-----

Every symbol must be declared before it is used. This is restrictive, but
it has benefits. It means you always know in which direction to look for
the definition of a subroutine or variable that is used at some point in
the program. When looking at a subroutine definition, you know that there
could not be a call to this routine from any routine defined earlier. In
general, it forces you to organize your program into a hierarchy where
there are distinct "layers" of low-level routines, followed by
higher-level routines. You can replace a layer without disrupting any
lower layers.

A symbol is defined from the point where it is declared to the end of its
scope. The scope of a variable declared inside a procedure or function (a
private variable) ends at the end of the procedure or function. The scope
of all other constants, procedures, functions and variables ends at the
end of the source file in which they are declared and they are referred
to as local, unless the word global precedes their declaration, in which
case their scope extends indefinitely. Procedures and functions can call
themselves recursively.

Constant declarations must be outside of any subroutine.

Variable declarations inside a subroutine must all appear at the
beginning, before the executable statements of the subroutine.

A special case is that of the controlling variable used in a for-loop. It
is automatically declared at the beginning of the loop, and its scope
ends at the end of the for-loop. If the loop is inside a function or
procedure, the loop variable is a private variable and may not have the
same name as any other private variable.
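
A short sketch of these scope rules follows. The function name sum_to and
its variables are invented for this illustration. The private variable
total is declared before any executable statement, while the loop
variable i is never declared by the programmer.

```euphoria
function sum_to(integer n)
-- add up the integers from 1 to n
    integer total       -- private: scope ends at "end function"

    total = 0
    for i = 1 to n do   -- i is declared automatically by the for statement
        total = total + i
    end for
    -- i is no longer defined here; total and n still are
    return total
end function

? sum_to(10)            -- 55
```

Declaring "integer i" at the top of sum_to() would not be accepted, since
the loop variable may not share its name with another private variable.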
When the loop is at the top level, outside of any function or procedure,
the loop variable is a local variable and may not have the same name as
any other global or local variable in that file. You do not declare loop
variables as you would other variables. The range of values specified in
the for statement defines the legal values of the loop variable -
specifying a type would be redundant and is not allowed.

Specifying the type of a variable
---------------------------------

Variable declarations have a type name followed by a list of the
variables being declared. For example,

        object a

        global integer x, y, z

        procedure fred(sequence q, sequence r)

In a parameter list like the one above, the type name may only be
followed by a single variable name.

The types: object, sequence, atom and integer are predefined. Variables
of type object may take on any value. Those declared with type sequence
must always be sequences. Those declared with type atom must always be
atoms. Those declared with type integer must be atoms with integer values
from -1073741824 to +1073741823 inclusive. You can perform exact
calculations on larger integer values, up to about 15 decimal digits, but
declare them as atom, rather than integer.

Performance Note: Calculations using variables declared as integer will
usually be somewhat faster than calculations involving variables declared
as atom. If your machine has floating-point hardware, Euphoria will use
it to manipulate atoms that aren't representable as integers. If your
machine doesn't have floating-point hardware, Euphoria will call software
floating-point emulation routines contained in ex.exe. You can force
Euphoria to bypass any floating-point hardware, by setting an environment
variable:

        SET NO87=1

The slower software routines will be used, but this could be of some
advantage if you are worried about the floating-point bug in some Pentium
chips.

To augment the predefined types, you can create new types.
All you have to do is define a single-parameter function, but declare it
with type ... end type instead of function ... end function. For example,

        type hour(integer x)
            return x >= 0 and x <= 23
        end type

        hour h1, h2

This guarantees that variables h1 and h2 can only be assigned integer
values in the range 0 to 23 inclusive. After an assignment to h1 or h2
the interpreter will call "hour()", passing the new value. The parameter
x will first be checked to see if it is an integer. If it is, the return
statement will be executed to test the value of x (i.e. the new value of
h1 or h2). If "hour" returns true, execution continues normally. If
"hour" returns false then the program is aborted with a suitable
diagnostic message.

        procedure set_time(hour h)

set_time() above can only be called with a reasonable value for parameter
h.

A variable's type will be checked after each assignment to the variable
(except where the compiler can predetermine that a check will not be
necessary), and the program will terminate immediately if the type
function returns false. Subroutine parameter types are checked when the
subroutine is called. This checking guarantees that a variable can never
have a value that does not belong to the type of that variable.

Unlike other languages, the type of a variable does not affect any
calculations on the variable. Only the value of the variable matters in
an expression. The type just serves as an error check to prevent any
"corruption" of the variable.

Type checking can be turned off or on in between subroutines using the
"with type_check" or "without type_check" commands. It is initially on by
default.

Note to Benchmarkers: When comparing the speed of Euphoria programs
against programs written in other languages, you should specify without
type_check at the top of the file, unless the other language provides a
comparable amount of run-time checking. This gives Euphoria permission to
skip runtime type checks, thereby saving some execution time.
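
As an illustration only, a benchmark file might begin as sketched below.
The loop and the variable h are invented for this example. The loop is
written so that h always holds a legal hour, which means the program
behaves the same whether or not the checks are actually performed.

```euphoria
without type_check      -- at the top of the file, before any routines

type hour(integer x)
    return x >= 0 and x <= 23
end type

hour h

h = 0
for i = 1 to 100000 do
    -- step h through 0..23 without ever assigning an illegal value
    if h = 23 then
        h = 0
    else
        h = h + 1
    end if
end for
```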
All other checks are still performed, e.g. subscript checking, uninitialized variable checking etc. Even when you turn off type checking, Euphoria reserves the right to make checks at strategic places, since this can actually allow it to run your program faster in many cases. So you may still get a type check failure even when you have turned off type checking. With or without type_check, you will never get a machine-level exception. You will always get a meaningful message from Euphoria when something goes wrong.

Euphoria's method of defining types is much simpler than what you will find in other languages, yet Euphoria provides the programmer with greater flexibility in defining the legal values for a type of data. Any algorithm can be used to include or exclude values. You can even declare a variable to be of type object which will allow it to take on any value. Routines can be written to work with very specific types, or very general types.

Strict type definitions can greatly aid the process of debugging. Logic errors are caught close to their source and are not allowed to propagate in subtle ways through the rest of the program. Furthermore, it is much easier to reason about the misbehavior of a section of code when you are guaranteed that the variables involved always had a legal value, if not the desired value.

Types also provide meaningful, machine-checkable documentation about your program, making it easier for you or others to understand your code at a later date. Combined with the subscript checking, uninitialized variable checking, and other checking that is always present, strict run-time type checking makes debugging much easier in Euphoria than in most other languages. It also increases the reliability of the final program since many latent bugs that would have survived the testing phase in other languages will have been caught by Euphoria.

Anecdote 1: In porting a large C program to Euphoria, a number of latent bugs were discovered.
Although this C program was believed to be totally "correct", we found: a situation where an uninitialized variable was being read; a place where element number "-1" of an array was routinely written and read; and a situation where something was written just off the screen. These problems resulted in errors that weren't easily visible to a casual observer, so they had survived testing of the C code.

Anecdote 2: The Quick Sort algorithm presented on page 117 of Writing Efficient Programs by Jon Bentley has a subscript error! The algorithm will sometimes read the element just before the beginning of the array to be sorted, and will sometimes read the element just after the end of the array. Whatever garbage is read, the algorithm will still work - this is probably why the bug was never caught. But what if there isn't any (virtual) memory just before or just after the array? Bentley later modifies the algorithm such that this bug goes away -- but he presented this version as being correct. Even the experts need subscript checking!

Performance Note: When typical user-defined types are used extensively, type checking adds only 20 to 40 percent to execution time. Leave it on unless you really need the extra speed. You might also consider turning it off for just a few heavily-executed routines. Profiling can help with this decision.


2.4 Statements
--------------

The following kinds of executable statements are available:

        o assignment statement
        o procedure call
        o if statement
        o while statement
        o for statement
        o return statement
        o exit statement

Semicolons are not used in Euphoria, but you are free to put as many statements as you like on one line, or to split a single statement across many lines. You may not split a statement in the middle of a variable name, string, number or keyword.

An assignment statement assigns the value of an expression to a simple variable, or to a subscript or slice of a variable. e.g.

        x = a + b

        y[i] = y[i] + 1

        y[i..j] = {1, 2, 3}

The previous value of the variable, or element(s) of the subscripted or sliced variable are discarded. For example, suppose x was a 1000-element sequence that we had initialized with:

        object x

        x = repeat(0, 1000)   -- repeat 0, 1000 times

and then later we assigned an atom to x with:

        x = 7

This is perfectly legal since x is declared as an object. The previous value of x, namely the 1000-element sequence, would simply disappear. Actually, the space consumed by the 1000-element sequence will be automatically recycled due to Euphoria's dynamic storage allocation.

A procedure call starts execution of a procedure, passing it an optional list of argument values. e.g.

        plot(x, 23)

An if statement tests an expression to see if it is 0 (false) or non-zero (true) and then executes the appropriate series of statements. There may be optional elsif and else clauses. e.g.

        if a < b then
            x = 1
        end if

        if a = 9 then
            x = 4
            y = 5
        else
            z = 8
        end if

        if char = 'a' then
            x = 1
        elsif char = 'b' then
            x = 2
        elsif char = 'c' then
            x = 3
        else
            x = -1
        end if

A while statement tests an expression to see if it is non-zero (true), and while it is true a loop is executed. e.g.

        while x > 0 do
            a = a * 2
            x = x - 1
        end while

A for statement sets up a special loop with a controlling loop variable that runs from an initial value up or down to some final value. e.g.

        for i = 1 to 10 do
            ? i   -- ? is a short form for print() -- see library.doc
        end for

        for i = 10 to 20 by 2 do
            for j = 20 to 10 by -2 do
                ? {i, j}
            end for
        end for

The loop variable is declared automatically and exists until the end of the loop. Outside of the loop the variable has no value and is not even declared. If you need its final value, copy it into another variable before leaving the loop. The compiler will not allow any assignments to a loop variable. The initial value, loop limit and increment must all be atoms. If no increment is specified then +1 is assumed.
The limit and increment values are established when the loop is entered, and are not affected by anything that happens during the execution of the loop.

A return statement returns from a subroutine. If the subroutine is a function or type then a value must also be returned. e.g.

        return

        return {50, "FRED", {}}

An exit statement may appear inside a while-loop or a for-loop. It causes immediate termination of the loop, with control passing to the first statement after the loop. e.g.

        for i = 1 to 100 do
            if a[i] = x then
                location = i
                exit
            end if
        end for

It is also quite common to see something like this:

        constant TRUE = 1

        while TRUE do
            ...
            if some_condition then
                exit
            end if
            ...
        end while

i.e. an "infinite" while-loop that actually terminates via an exit statement at some arbitrary point in the body of the loop.


2.5 Top-Level Commands
----------------------

Euphoria processes your .ex file in one pass, starting at the first line and proceeding through to the last line. When a procedure or function definition is encountered, the routine is checked for syntax and converted into an internal form, but no execution takes place. When a statement that is outside of any routine is encountered, it is checked for syntax, converted into an internal form and then immediately executed. If your .ex file contains only routine definitions, but no immediate execution statements, then nothing will happen when you try to run it (other than syntax checking). You need to have an immediate statement to call your main routine (see the example program in section 1.1). It is quite possible to have a .ex file with nothing but immediate statements. For example, you might want to use Euphoria as a desk calculator, typing just one print (or ?) statement into a file and then executing it.
The langwar demo program (euphoria\demo\langwar\lw.ex) quickly reads in and displays a file on the screen, before the rest of the program is compiled (on a 486 or higher this makes little difference as the compiler takes less than a second to finish compiling the entire program). Another common practice is to immediately initialize a global variable, just after its declaration.

The following special commands may only appear at the top level, i.e. outside of any function or procedure. As we have seen, it is also possible to use any Euphoria statement, including for-loops, while-loops, if statements etc. (but not return), at the top level.

include filename - reads in (compiles) a Euphoria source file in the presence of any global symbols that have already been defined. Global symbols defined in the included file remain visible in the remainder of the program. If an absolute pathname is given, Euphoria will use it. When a relative pathname is given, Euphoria will first look for filename in the same directory as the main file given on the ex command line. If it's not there, it will look in %EUDIR%\include, where EUDIR is the environment variable that must be set when using Euphoria. This directory contains the standard Euphoria include files. An include statement will be quietly ignored if the file has already been included, directly or indirectly.

with - turns on one of the compile options: profile, trace, warning or type_check. Options warning and type_check are initially on, while profile and trace are initially off.

without - turns off one of the above options. Note that each of these options may be turned on or off between subroutines but not inside of a subroutine.

These options apply globally. For example if you have:

        without type_check
        include graphics.e

then type checking will be turned off inside graphics.e as well as in the current file.
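
The placement rules for with and without can be sketched as follows. This is only an illustration, not code from the Euphoria distribution: the routine name sum_hours is hypothetical, and the hour type is the example type defined earlier in this manual. Type checking is switched off just before one heavily executed routine and switched back on after it.

```euphoria
type hour(integer x)
    return x >= 0 and x <= 23
end type

without type_check      -- Euphoria may skip type checks from here on

function sum_hours(sequence s)
    -- hypothetical heavily executed routine
    hour h
    integer total
    total = 0
    for i = 1 to length(s) do
        h = s[i]
        total = total + h
    end for
    return total
end function

with type_check         -- checking is on again for everything that follows

? sum_hours({1, 2, 3})
```

Because the options can only change between subroutines, the two commands sit outside the function body. Even in the unchecked region, Euphoria may still choose to perform some type checks.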

Profiling
---------

If you specify "with profile" then an execution profile will be produced when your program finishes execution. It is written to the file "ex.pro" in the current directory. A profile is a listing of your program showing the number of times each statement was executed. Only statements compiled "with profile" will be shown. Normally you will say "with profile" at the top of your main .ex file, so you can get a complete listing. View this file with the Euphoria editor to see a color display.

Profiling can help you in many ways: it lets you see which statements are heavily executed, so you can try to speed up your program; it lets you verify that your program is actually working the way you intended; and it lets you see which sections of code were not tested - don't let your users be the first!

Redirecting Standard Input and Standard Output
----------------------------------------------

Routines such as gets() and puts() can use standard input (file #0), standard output (file #1), and standard error output (file #2). Standard input and output can then be redirected as in:

        ex myprog < myinput > myoutput

See the I/O routines in Part II section 2.6 for more details.


3. Debugging
============

Debugging in Euphoria is much easier than in most other programming languages. The extensive runtime checking provided at all times by Euphoria automatically catches many bugs that in other languages might take hours of your time to track down. When Euphoria catches an error, you will always get a brief report on your screen, and a detailed report in a file called "ex.err". These reports always include a full English description of what happened, along with a call-stack traceback. The file ex.err will also have a dump of all variable values, and optionally a list of the most recently executed statements. For extremely large sequences, only a partial dump is shown.
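
As a small sketch of the kind of bug these run-time checks catch, the hypothetical program below reads one element past the end of a sequence. Going by the description above, it should stop on the fourth iteration with a brief report on the screen and a fuller one in ex.err. The exact wording of the report is not reproduced here.

```euphoria
sequence s
s = {10, 20, 30}

for i = 1 to 4 do
    ? s[i]      -- s has only 3 elements, so s[4] is an out-of-bounds subscript
end for
```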

In addition, you are able to create user-defined types that precisely determine the set of legal values for each of your variables. An error report will occur the moment that one of your variables is assigned an illegal value.

Sometimes a program will misbehave without failing any runtime checks. In any programming language it may be a good idea to simply study the source code and rethink the algorithm that you have coded. It may also be useful to insert print statements at strategic locations in order to monitor the internal logic of the program. This approach is particularly convenient in an interpreted language like Euphoria since you can simply edit the source and rerun the program without waiting for a recompile/relink.

Euphoria provides you with additional powerful tools for debugging. You can trace the execution of your program source code on one screen while you witness the output of your program on another. The "with trace" / "without trace" commands select the subroutines in your program that are available for tracing. Often you will simply insert a "with trace" command at the very beginning of your source code to make it all traceable. Sometimes it is better to place the first "with trace" after all of your user-defined types, so you don't trace into these routines after each assignment to a variable. At other times, you may know exactly which routine or routines you are interested in tracing, and you will want to select only these ones. Of course, once you are in the trace window you can interactively skip over the execution of any routine by pressing down-arrow on the keyboard rather than Enter.

Only traceable lines can appear in ex.err as "most-recently-executed lines" should a runtime error occur. If you want this information and didn't get it, you should insert a "with trace" and then rerun your program. Execution will be a bit slower when lines compiled "with trace" are executed.
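
The selection of traceable routines might be laid out as in the sketch below. The names set_time and helper are only illustrative, and hour is the example type from earlier in this manual. Placing "with trace" after the type definition keeps hour() out of the trace, and "without trace" excludes the routines that come after it. Marking lines as traceable does not by itself start a trace.

```euphoria
type hour(integer x)        -- defined before "with trace", so not traceable
    return x >= 0 and x <= 23
end type

with trace                  -- routines defined from here on are traceable

procedure set_time(hour h)
    ? h
end procedure

without trace               -- routines defined from here on are not traceable

procedure helper()
    puts(1, "helper called\n")
end procedure

set_time(12)
helper()
```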

After you have predetermined the lines that are traceable, your program must then dynamically cause the trace facility to be activated by executing a trace(1) statement. Again, you could simply say:

        with trace
        trace(1)   -- or trace(2) if you prefer a mono display

at the top of your program, so you can start tracing from the beginning of execution. More commonly, you will want to trigger tracing when a certain routine is entered, or when some condition arises. e.g.

        if x < 0 then
            trace(1)
        end if

You can turn off tracing by executing a trace(0) statement. You can also turn it off interactively by typing 'q' to quit tracing. Remember that "with trace" must appear outside of any routine, whereas trace(1) and trace(0) can appear inside a routine or outside.

You might want to turn on tracing from within a type. Suppose you run your program and it fails, with the ex.err file showing that one of your variables has been set to a strange, although not illegal value, and you wonder how it could have happened. Simply create a type for that variable that executes trace(1) if the value being assigned to the variable is the strange one that you are interested in. e.g.

        type positive_int(integer x)
            if x = 99 then
                trace(1) -- how can this be???
                return 1 -- keep going
            else
                return x > 0
            end if
        end type

You will then be able to see the exact statement that caused your variable to be set to the strange value, and you will be able to check the values of other variables. You will also be able to check the output screen to see what has been happening up to this precise moment. If you make your special type return 0 for the strange value instead of 1, you can force a dump into ex.err.

The Trace Screen
----------------

When a trace(1) statement is executed, your main output screen is saved and a trace screen appears. It shows a view of your program with the statement that will be executed next highlighted, and several statements before and after showing as well.
Several lines at the bottom of the screen are reserved for displaying variable names and values. The top line shows the commands that you can enter at this point:

    F1 - display main output screen - take a look at your program's output so far

    F2 - redisplay trace screen. Press this key while viewing the main output screen to return to the trace display.

    Enter - execute the currently-highlighted statement only

    down-arrow - continue execution and break when any statement coming after this one in the source listing is executed. This lets you skip over subroutine calls. It also lets you force your way out of repetitive loops.

    ? - display the value of a variable. Many variables are displayed automatically as they are assigned a value, but sometimes you will have to explicitly ask for one that is not on display. After hitting ? you will be prompted for the name of the variable. Variables that are not defined at this point cannot be shown. Variables that have not yet been initialized will have <NO VALUE> beside their name.

    q - quit tracing and resume normal execution. Tracing will start again when the next trace(1) is executed.

    ! - this will abort execution of your program. A traceback and dump of variable values will go to ex.err.

As you trace your program, variable names and values appear automatically in the bottom portion of the screen. Whenever a variable is assigned-to you will see its name and new value appear at the bottom. This value is always kept up-to-date. Private variables are automatically cleared from the screen when their routine returns. When the variable display area is full, least-recently referenced variables will be discarded to make room for new variables.

For your convenience, numbers that are in the range of printable ASCII characters (32-127) are displayed along with the ASCII character itself. The ASCII character will be in a different color (or in quotes in a mono display).
This is done for all variables, since Euphoria does not know in general whether you are thinking of a number as an ASCII character or not. You will also see ASCII characters (in quotes) in ex.err. This can make for a rather "busy" display, but the ASCII information is often very useful.

The trace screen adopts the same graphics mode as the main output screen. This makes flipping between them quicker and easier.

When a traced program requests keyboard input, the main output screen will appear, to let you type your input as you normally would. This works fine for gets() input. When get_key() (which quickly samples the keyboard) is called, you will be given 10 seconds to type a character; otherwise it is assumed that there is no input for this call to get_key(). This allows you to test the case of input and also the case of no input for get_key().


--- END OF PART I ---

see library.doc for Part II - Library Routines